Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add UUID version 7 as the default guid generator #3249

Merged
merged 9 commits into from
Sep 1, 2024

Conversation

ChrisJollyAU
Copy link
Contributor

From discussion in #2909

This adds a GUID generator compatible with the UID Version 7.

Code is based on the Guid.CreateVersion7 function that is part of .Net 9. Since EFCore and EFCore.PG target .Net 8 we don't get that function by default

@roji
Copy link
Member

roji commented Aug 23, 2024

@ChrisJollyAU without going into the fine details, that looks good overall to me... @Timovzl can I ask you to sign off on this as well?

@Timovzl
Copy link

Timovzl commented Aug 26, 2024

@ChrisJollyAU Below is my implementation using an almost-pure copy of the .NET 9 code. The only deviations are the ref var resultClone line and the references to resultClone instead of result.

While the downside is that this needs the extra struct, we get to use the source code as directly as possible. I feel safest staying as close to that as possible. Let me know your thoughts.

public class NpgsqlGuidVersion7ValueGenerator : ValueGenerator<Guid>
{
    public override bool GeneratesTemporaryValues => false;

    public override Guid Next(EntityEntry entry) => BorrowedFromNet9.CreateVersion7(timestamp: DateTimeOffset.UtcNow);

    // Code borrowed from .NET 9 should be removed as soon as the target framework includes such code
    #region Borrowed from .NET 9

#pragma warning disable IDE0007 // Use implicit type -- Avoid changes to code borrowed from BCL

    // https://github.com/dotnet/runtime/blob/f402418aaed508c1d77e41b942e3978675183bfc/src/libraries/System.Private.CoreLib/src/System/Guid.cs
    private static class BorrowedFromNet9
    {
        private const byte Variant10xxMask = 0xC0;
        private const byte Variant10xxValue = 0x80;

        private const ushort VersionMask = 0xF000;
        private const ushort Version7Value = 0x7000;

        /// <summary>Creates a new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</summary>
        /// <returns>A new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</returns>
        /// <remarks>
        ///     <para>This uses <see cref="DateTimeOffset.UtcNow" /> to determine the Unix Epoch timestamp source.</para>
        ///     <para>This seeds the rand_a and rand_b sub-fields with random data.</para>
        /// </remarks>
        public static Guid CreateVersion7() => CreateVersion7(DateTimeOffset.UtcNow);

        /// <summary>Creates a new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</summary>
        /// <param name="timestamp">The date time offset used to determine the Unix Epoch timestamp.</param>
        /// <returns>A new <see cref="Guid" /> according to RFC 9562, following the Version 7 format.</returns>
        /// <exception cref="ArgumentOutOfRangeException"><paramref name="timestamp" /> represents an offset prior to <see cref="DateTimeOffset.UnixEpoch" />.</exception>
        /// <remarks>
        ///     <para>This seeds the rand_a and rand_b sub-fields with random data.</para>
        /// </remarks>
        public static Guid CreateVersion7(DateTimeOffset timestamp)
        {
            // NewGuid uses CoCreateGuid on Windows and Interop.GetCryptographicallySecureRandomBytes on Unix to get
            // cryptographically-secure random bytes. We could use Interop.BCrypt.BCryptGenRandom to generate the random
            // bytes on Windows, as is done in RandomNumberGenerator, but that's measurably slower than using CoCreateGuid.
            // And while CoCreateGuid only generates 122 bits of randomness, the other 6 bits being for the version / variant
            // fields, this method also needs those bits to be non-random, so we can just use NewGuid for efficiency.
            Guid result = Guid.NewGuid();

            // 2^48 is roughly 8925.5 years, which from the Unix Epoch means we won't
            // overflow until around July of 10,895. So there isn't any need to handle
            // it given that DateTimeOffset.MaxValue is December 31, 9999. However, we
            // can't represent timestamps prior to the Unix Epoch since UUIDv7 explicitly
            // stores a 48-bit unsigned value, so we do need to throw if one is passed in.

            long unix_ts_ms = timestamp.ToUnixTimeMilliseconds();
            ArgumentOutOfRangeException.ThrowIfNegative(unix_ts_ms, nameof(timestamp));

            ref var resultClone = ref Unsafe.As<Guid, GuidDoppleganger>(ref result); // Deviation from BLC: Reinterpret Guid as our own type so that we can manipulate its private fields

            Unsafe.AsRef(in resultClone._a) = (int)(unix_ts_ms >> 16);
            Unsafe.AsRef(in resultClone._b) = (short)(unix_ts_ms);

            Unsafe.AsRef(in resultClone._c) = (short)((resultClone._c & ~VersionMask) | Version7Value);
            Unsafe.AsRef(in resultClone._d) = (byte)((resultClone._d & ~Variant10xxMask) | Variant10xxValue);

            return result;
        }
    }

    /// <summary>
    /// Used to manipulate the private fields of a <see cref="Guid"/> like its internal methods do, by treating a <see cref="Guid"/> as a <see cref="GuidDoppleganger"/>.
    /// </summary>
    [StructLayout(LayoutKind.Sequential)]
    internal readonly struct GuidDoppleganger
    {
#pragma warning disable IDE1006 // Naming Styles -- Avoid further changes to code borrowed from BCL when working with the current type
        internal readonly int _a;   // Do not rename (binary serialization)
        internal readonly short _b; // Do not rename (binary serialization)
        internal readonly short _c; // Do not rename (binary serialization)
        internal readonly byte _d;  // Do not rename (binary serialization)
        internal readonly byte _e;  // Do not rename (binary serialization)
        internal readonly byte _f;  // Do not rename (binary serialization)
        internal readonly byte _g;  // Do not rename (binary serialization)
        internal readonly byte _h;  // Do not rename (binary serialization)
        internal readonly byte _i;  // Do not rename (binary serialization)
        internal readonly byte _j;  // Do not rename (binary serialization)
        internal readonly byte _k;  // Do not rename (binary serialization)
#pragma warning restore IDE1006 // Naming Styles
    }

#pragma warning restore IDE0007 // Use implicit type

    #endregion
}

/// Generates sequential <see cref="Guid" /> values according to the UUID version 7 specification.
/// Will be updated to use Guid.CreateVersion7 when available.
/// </summary>
public class UUid7ValueGenerator : ValueGenerator<Guid>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: The .NET team has chosen to stick to avoid using multiple synonyms, hence Guid.CreateVersion7. For consistency, I would go with GuidVersion7ValueGenerator.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: All generators in the current package are prefixed with Npgsql. Should we follow that convention or not, @roji?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moreover, they are marked as internal APIs. Should we rather be consistent with them or with EF Core's own GuidValueGenerator?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EF's GuidValueGenerator is fully public since it's meant to be referenceable by external providers, i.e. it's infrastructure; whereas types inside EFCore.PG aren't... So I think it's indeed better to make the generator pubternal (by moving it into the Internal namespace) - we don't want people to start using it for whatever purpose, it's just an internal implementation detail of the provider.

Given that it would be pubternal, the naming doesn't matter too much. I have a slight preference for UUID7, since that's the definition of that particular format; the .NET type which holds UUIDs (of any sort) happens to be Guid, but the actual format is UUID7 I think. But no strong feelings.

Re the Npgsql prefix, no strong feelings there either (again, it would be a pubternal type anyway). Since there's nothing particular Npgsql-specific about it - it just generates UUID7s, I'd personally leave it out.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's have the prefix, for consistency with similar existing types in the package, and to reinforce the pubternal-ness some more.

src/EFCore.PG/ValueGeneration/UUid7ValueGenerator.cs Outdated Show resolved Hide resolved
@roji
Copy link
Member

roji commented Aug 26, 2024

Thanks for the additional implementation proposal @Timovzl. I don't have strong feels between your doppleganger reinterpret cast solution and @ChrisJollyAU - remember this is all planned to go away in a year, so the important thing is to have something that works here. @ChrisJollyAU any comment?

@ChrisJollyAU
Copy link
Contributor Author

Don't mind whichever way. I'm happy to adjust to @Timovzl suggestion. Practically its modifying the data as the int and short as part of a struct versus the individual bytes. Might end up a couple less instructions on the struct but probably won't make too much noticeable difference

@Timovzl
Copy link

Timovzl commented Aug 26, 2024

In the absence of thorough testing, my preference would be to stay as close as possible to the source implementation.

Whichever we pick, perhaps we can include a Theory that, for a handful of timestamps, asserts that the result is identical to the output of Guid.CreateVersion7(timestamp) (as a constant passed to the theory, obtained from .NET 9 by hand).

@ChrisJollyAU
Copy link
Contributor Author

@Timovzl I have this test

public void CustomUuid7Test()
        {
            DateTimeOffset dtoNow = DateTimeOffset.UtcNow;
            Guid net9internal = Guid.CreateVersion7(dtoNow);
            Guid custom = Next(dtoNow);
            var bytenet9 = net9internal.ToByteArray().AsSpan(0, 6);
            var bytecustom = custom.ToByteArray().AsSpan(0,6);
            Assert.Equal(bytenet9,bytecustom);
            Assert.Equal(net9internal.Version,custom.Version);
            var t1 = net9internal.Variant & Variant10xxMask;
            var t2 = BitConverter.GetBytes(custom.Variant);
            Assert.InRange(net9internal.Variant,8,0xB);
            Assert.InRange(custom.Variant, 8, 0xB);
        }

@Timovzl
Copy link

Timovzl commented Aug 26, 2024

@ChrisJollyAU Nice. Looks like it needs .NET 9, though, so we might need to precalculate the net9internal value and parse that from a constant here, right?

@ChrisJollyAU
Copy link
Contributor Author

@Timovzl I must check how efcore.pg does the tests. I know my EFCore.Jet setup has the provider targeting .Net 8 but the tests target .Net 9

@roji
Copy link
Member

roji commented Aug 26, 2024

For testing, check out NpgsqlValueGeneratorSelectorTest - this is a unit test suite that would be the easiest/ideal place to place such a test.

Given we're all OK with both implementation options, I'd say let's go with @Timovzl suggestion as the closest to the .NET implementation (but once again am fine either way). Once the PR is updated and some minimal testing is added I'm happy to merge that for 9.0!

@ChrisJollyAU
Copy link
Contributor Author

@roji Thanks. Added the test there. As it happens your test project does target .Net 9 (preview 3) so we can use Guid.CreateVersion7

I've moved the value generator to the Internal folder under ValueGeneration so that it gets the .Internal added to its namespace

@Timovzl I needed to access the BorrowedFromNet9 from the test so I couldn't use it as private. Made it internal instead


namespace Npgsql.EntityFrameworkCore.PostgreSQL.ValueGeneration.Internal;

/// <summary>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Chore: Let's replace this summary, and the two below it, by the "pubternal" summary (copied from similar types elsewhere in the project):

/// <summary>
///     This API supports the Entity Framework Core infrastructure and is not intended to be used
///     directly from your code. This API may change or be removed in future releases.
/// </summary>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything else in that ValueGeneration folder looks like it uses that wording so maybe keep it like that to not be different to other value generators

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's true that the EF standard is to have the standard "pubternal" summary on everything - both the class and anything public/protected on it. This probably isn't applied consistently in the provider at the moment, and also isn't extremely important.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated summary description

/// Generates sequential <see cref="Guid" /> values according to the UUID version 7 specification.
/// Will be updated to use Guid.CreateVersion7 when available.
/// </summary>
public class NpgsqlUUid7ValueGenerator : ValueGenerator<Guid>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: The class name says UUid, which should be Uuid. The file name says UUID, which should also be Uuid.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name of class changed, but filename change only worked locally

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yeah, that annoying issue. Add a digit or something, *2.cs, push, then rename to the desired casing and push again.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edit: If we add the Npgsql prefix, the casing change would also work.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, did you add the prefix without adjusting the casing, by any chance?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File name should be sorted now

@roji
Copy link
Member

roji commented Aug 30, 2024

@ChrisJollyAU is this ready for a final review?

@ChrisJollyAU
Copy link
Contributor Author

@roji Should be. Unless there's anything I've missed.

CI doesn't seem to be passing but if it matches what I have locally it is unrelated to this PR. (A compile time syntax error in test String_Join_non_aggregate and some errors in the DatabaseModelFactoryTest 42809: "BlogsView" is not a table)

Copy link
Member

@roji roji left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the work here @ChrisJollyAU and @Timovzl - much appreciated!

Compilation errors were because the project was still using dotnet SDK 9.0.0-preview.3, which did not yet contain the new Guid.CreateVersion7() API - I pushed a commit to bump that to preview.7. I also commented out a test that's causing compilation issues, I believe the problem is already solved in later SDKs and will make sure to address before the release.

Other than that everything looks great, thanks!

@roji roji merged commit b91ef35 into npgsql:main Sep 1, 2024
15 checks passed
roji added a commit to roji/efcore.pg that referenced this pull request Sep 1, 2024
roji added a commit to roji/Npgsql.Doc that referenced this pull request Sep 1, 2024
roji added a commit to roji/Npgsql.Doc that referenced this pull request Sep 1, 2024
roji added a commit to roji/Npgsql.Doc that referenced this pull request Sep 1, 2024
roji added a commit that referenced this pull request Sep 1, 2024
roji added a commit to npgsql/doc that referenced this pull request Sep 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants